{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "31f5b6fe-80cd-4ff1-87fa-a3fcae45ea63",
   "metadata": {},
   "source": [
    "# Working with Trajectories\n",
    "\n",
    "## Objective\n",
    "\n",
    "The objective of this tutorial is to show how to create QM input files using trajectory files. In this example you will see QMzyme can be used to generate multiple QMzymeModels from different trajectory snapshots. \n",
    "\n",
    "It is recommended to first visit the QM-only calculation and QM/xTB calculation workflows, since both workflows are utilized. You can find it [here for QM-only](https://qmzyme.readthedocs.io/en/latest/Examples/QM-only%20Calculation.html) and [here for QM/xTB](https://qmzyme.readthedocs.io/en/latest/Examples/QMxTB%20Calculation.html).\n",
    "\n",
    "This workflow allows you to:\n",
    "\n",
    "- Be able to extract specific frames from the trajectory (PQR and DCD) files.\n",
    "- Create QM input files from trajectory files\n",
    "\n",
    "In this specific example, we are using ketosteroid isomerase (KSI) as the model system. The structure for KSI is obtained from the PDB [1OH0](https://doi.org/10.2210/pdb1OH0/pdb) and MM-minimized prior to this tutorial. We will utilize it to create an input file for ORCA for geometry optimization and frequency calculation.\n",
    "\n",
    "## Classes used in this example:\n",
    "\n",
    "- [GenerateModel](https://qmzyme.readthedocs.io/en/latest/API/QMzyme.GenerateModel.html)\n",
    "- [QM_Method](https://qmzyme.readthedocs.io/en/latest/API/QMzyme.CalculateModel.html#qm-treatment)\n",
    "- [XTB_Method](https://qmzyme.readthedocs.io/en/latest/API/QMzyme.CalculateModel.html#xtb-treatment)\n",
    "- [CA_terminal TruncationScheme](https://qmzyme.readthedocs.io/en/latest/API/QMzyme.TruncationSchemes.html#QMzyme.TruncationSchemes.CA_terminal)\n",
    "- [DistanceCutoff SelectionScheme](https://qmzyme.readthedocs.io/en/latest/API/QMzyme.SelectionSchemes.html#QMzyme.SelectionSchemes.DistanceCutoff)\n",
    "\n",
    "## Required Files\n",
    "\n",
    "To start, you will need:\n",
    "\n",
    "- A fully prepped and protonated PDB.\n",
    "  \n",
    "---"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "3ad09bda-c63f-4ea3-a59b-c4aaac6fa3fc",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Here are the necesary imports for this tutorial!\n",
    "\n",
    "import QMzyme\n",
    "from QMzyme.SelectionSchemes import DistanceCutoff\n",
    "from QMzyme.data import PQR, DCD"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "55650e72-3e51-4972-be6b-2d54ff31972a",
   "metadata": {},
   "source": [
    "## Initialize Model and Define Regions\n",
    "\n",
    "To begin our example workflow, we need to start by initializing QMzymeModel using `GenerateModel()`.When loading PQR and DCD files, QMzyme will detect the number of frames "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "e317f761-96f0-45b7-8c79-984921f9e212",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Number of Frames:  1\n"
     ]
    }
   ],
   "source": [
    "# We first initilize QMzymeModel by loading structure file.\n",
    "# The initial pdb file should be prepared prior (hydrogens must be present).\n",
    "model = QMzyme.GenerateModel(PQR, DCD)\n",
    "\n",
    "# We can use this print statement to show how many frames are present.\n",
    "print(\"Number of Frames: \", model.universe.trajectory.n_frames)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7b9a400d-da47-4365-9093-bb3ed02074db",
   "metadata": {},
   "source": [
    "## QM-only: Loop over frames to generate QMzymeModels\n",
    "\n",
    "In this first scenario, you will generate five QM-only models using the DistanceCutoff selection scheme with a cutoff of 3 Angstroms. To achieve this, we need to first set the `QM_Method()`. From there, we can use the `for` loop to choose our frame range. In each of the snapshot, we can choose our QMzymeModel to be the structure snapshot at the given frame, which is used for `set_catalytic_center` and `set_region` to get our QM region. Afterwards, we can freeze C-alpha atoms with `set_fixed_atoms`, `truncate()`, and `write_input`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d43b9337-7790-40f4-a13f-43553d49e2bf",
   "metadata": {},
   "outputs": [],
   "source": [
    "# We first decide on the QM_Method for input file.\n",
    "qm_method = QMzyme.QM_Method(\n",
    "    basis_set='6-31G*', \n",
    "    functional='wB97X-D3', \n",
    "    qm_input='OPT FREQ', \n",
    "    program='orca'\n",
    ")\n",
    "\n",
    "# We use for loop to extract snapshops every 10 frames across 50 frames.\n",
    "for frame in range(0, 50, 10):\n",
    "    print('\\n====================================')\n",
    "    print(f'             Frame {frame}')\n",
    "    print('====================================')\n",
    "\n",
    "    # For each 10 frames, we extract structure information and set it as QMzymeModel.\n",
    "    m = QMzyme.GenerateModel(PQR, DCD, frame=frame)\n",
    "\n",
    "    # We set bound ligand as the catalytic center.\n",
    "    m.set_catalytic_center('resid 263')\n",
    "\n",
    "    # DistanceCutoff class is used to get 3 Angstrom region.\n",
    "    m.set_region(selection=DistanceCutoff, cutoff=3)\n",
    "\n",
    "    # We rename the region.\n",
    "    m.cutoff_3.rename(f'{m.cutoff_3.name}_frame_{frame}')\n",
    "\n",
    "    # We set C-alpha atoms as fixed atoms.\n",
    "    c_alpha_atoms = m.regions[-1].get_atoms(attribute='name', value='CA')\n",
    "    m.regions[-1].set_fixed_atoms(atoms=c_alpha_atoms)\n",
    "\n",
    "    # We assign QM_Method to the region.\n",
    "    qm_method.assign_to_region(region=m.regions[-1])\n",
    "\n",
    "    # We truncate the region using TerminalAlphaCarbon class.\n",
    "    m.truncate()\n",
    "\n",
    "    # And at last, we write the input!\n",
    "    m.write_input(filename=f\"{m.name}_cutoff3_frame{frame}_qm\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7e9a04cd-0415-47d4-bd32-cc00019bbb61",
   "metadata": {},
   "source": [
    "## QM/xTB: Loop over frames to generate QMzymeModels\n",
    "\n",
    "In this second scenario, you will generate five QM/xTB models with a QM region consisting of the ligand, EQU and the catalytic ASP103 residue and a large xTB region containing residues within 8 Angstroms of EQU. To achieve this, we need to first set `QM_Method()` and `XTB_Method()`. From there, we can use the `for` loop to choose our frame range. In each of the snapshot, we can choose our QMzymeModel to be the structure snapshot at the given frame, which is used for `set_catalytic_center` and `set_region` to get our QM region. Afterwards, we can `truncate()` and `write_input`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4890d8bc-e312-41c8-b1cd-3472a07af734",
   "metadata": {},
   "outputs": [],
   "source": [
    "# We first decide on the QM_Method for input file.\n",
    "qm_method = QMzyme.QM_Method(\n",
    "    basis_set='6-31G*', \n",
    "    functional='wB97X-D3', \n",
    "    qm_input='OPT FREQ', \n",
    "    program='orca'\n",
    ")\n",
    "# xTB method does not need any arguments.\n",
    "xtb_method = QMzyme.XTB_Method()\n",
    "\n",
    "# We use for loop to extract snapshops every 10 frames across 50 frames.\n",
    "for frame in range(0, 50, 10):\n",
    "    print('\\n====================================')\n",
    "    print(f'             Frame {frame}')\n",
    "    print('====================================')\n",
    "\n",
    "    # For each 10 frames, we extract structure information and set it as QMzymeModel.\n",
    "    m = QMzyme.GenerateModel(PQR, DCD, frame=frame)\n",
    "\n",
    "    # We set bound ligand as the catalytic center.\n",
    "    m.set_catalytic_center('resid 263')\n",
    "\n",
    "    # We specify which regions will be treated with QM_Method\n",
    "    m.set_region(name='qm_region', selection='resid 263 or resid 103')\n",
    "\n",
    "    # DistanceCutoff class is used to get 3 Angstrom region.\n",
    "    m.set_region(selection=DistanceCutoff, cutoff=8)\n",
    "\n",
    "    # We rename the region.\n",
    "    m.cutoff_8.rename(f'{m.cutoff_8.name}_frame_{frame}')\n",
    "\n",
    "    # We assign QM_Method and XTB_Method to respective regions.\n",
    "    qm_method.assign_to_region(region=m.qm_region)\n",
    "    xtb_method.assign_to_region(region=m.cutoff_8)\n",
    "\n",
    "    # We truncate the region using TerminalAlphaCarbon class.\n",
    "    m.truncate()\n",
    "\n",
    "    # And at last, we write the input!\n",
    "    m.write_input(filename=f\"{m.name}_cutoff8_frame{frame}_qmxtb\")"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}